-
Notifications
You must be signed in to change notification settings - Fork 478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Jupyter Kernel for SystemDS #1998
base: main
Are you sure you want to change the base?
Conversation
|
||
- Java JDK 11 or later | ||
- Maven | ||
- Git | ||
- Jupyter Notebook or JupyterLab |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mention the operating system this README is written for.
git clone https://github.com/kubieren/SystemDSKernel.git | ||
cd SystemDSKernel/kernelsds | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the code from the main SystemDS repo, which will be available after this PR is merged.
//JupyterSocket.JUPYTER_LOGGER.setLevel(Level.WARNING); | ||
JupyterSocket.JUPYTER_LOGGER.setLevel(Level.WARNING); | ||
|
||
KernelConnectionProperties connProps = KernelConnectionProperties.parse(contents); | ||
JupyterConnection connection = new JupyterConnection(connProps); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add detailed comments for readability and debuggability, especially for the code that configure flags.
.preferLong() | ||
//Keywords from a great poem at https://stackoverflow.com/a/12114140 | ||
.withKeywords("let", "this", "long", "package", "float") | ||
.withKeywords("goto", "private", "class", "if", "short") | ||
.withKeywords("while", "protected", "with", "debugger", "case") | ||
.withKeywords("continue", "volatile", "interface") | ||
.withKeywords("instanceof", "super", "synchronized", "throw") | ||
.withKeywords("extends", "final", "export", "throws") | ||
.withKeywords("try", "import", "double", "enum") | ||
.withKeywords("false", "boolean", "abstract", "function") | ||
.withKeywords("implements", "typeof", "transient", "break") | ||
.withKeywords("void", "static", "default", "do") | ||
.withKeywords("switch", "int", "native", "new") | ||
.withKeywords("else", "delete", "null", "public", "var") | ||
.withKeywords("in", "return", "for", "const", "true", "char") | ||
.withKeywords("finally", "catch", "byte") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these the keywords for dml? If so, we need to clean this up.
matrix[i][j] = Double.parseDouble(values[j].trim()); // Trim to remove any leading or trailing spaces. | ||
} catch (NumberFormatException e) { | ||
// Handle the case where the string cannot be parsed as a double. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These parsing logic seem to make strong assumptions regarding the code statements. I like to see some examples of complex dml code. You can use our builtins.
for (String out: outputArray | ||
) { | ||
//System.out.println("out: "+out); | ||
expr = expr + ";\nwrite("+out+", './tmp/"+out+"');" ; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing the output objects to disk after each cell is extremely inefficient. We need to utilize the jmlc framework to maintain the variables in-memory.
import org.apache.sysds.api.jmlc.*; | ||
import org.apache.sysds.common.Types; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to avoid wildcard imports.
Add Jupyter Kernel for SystemDS
Description
This pull request introduces a new Jupyter kernel for SystemDS, enabling seamless integration and execution of SystemDS scripts within Jupyter notebooks. The addition aims to provide a more interactive and user-friendly environment for data scientists and researchers working with SystemDS for machine learning and statistical data processing.
Benefits
How to Use
Please refer to the updated README in the
scripts/jupyterkernel/
directory for detailed instructions on installing and using the SystemDS kernel in your Jupyter environment.